SGI Developer Toolbox 6.1


Text File | 1994-08-02 | 28.3 KB | 728 lines

MOTIF PERFORMANCE NOTES

There are a number of things that can be done to make Motif applications
run as fast as possible. This document discusses a number of them.

Organization of this document is:

    * Introduction
    * General Observations
    * General Motif-specific Application Issues
    * Motif Start-up Issues
    * Motif Run-time Issues
    * Background processing

* INTRODUCTION

A Motif application must itself bear much of the responsibility for
appearing snappy and responsive. Of course the libraries (libXm, libXt,
etc) need to work as well and as fast as they can. But even if their
time is reduced to zero, some (many?) apps will be less responsive than
they could be.

This paper has a number of suggestions about how to improve the
performance of Motif applications. If you have suggestions, questions,
clarifications, etc, about this paper, please send them to dave@sgi.com
and they will be forwarded to the appropriate engineer.

Things this paper stresses include:

    * An app should respond to the GUI in foreground, and do any
      time-consuming real work in background (i.e. workproc's, timer
      routines, etc).

    * An app should take advantage of idle time to create interface
      elements (e.g. popups) that are likely to be needed soon. Any GUI
      component should come up as instantly as possible. Foreseeable
      things should be created and ready to map when needed -- don't
      wait until they are needed to create them.

    * If a component may be used again, don't destroy it -- just unmap
      it.

With a well-written application, there is (almost?) never any Motif/X11
reason for a popup or a dialog to be slow coming up -- even the first
time.

* GENERAL OBSERVATIONS:

    * A high-performance Motif program is necessarily a
      producer-consumer application.

      The GUI is the producer. It produces actions to be taken on behalf
      of the user. Nothing must be allowed to interfere with the
      producer rapidly becoming, and then remaining, responsive.
      Even if the user cannot take a second action until the (possibly
      slow) current action completes, the GUI should at least continue
      to responsively support:

          * modifying the long operation (if that makes sense)
          * cancelling the long operation (if that makes sense)
          * redrawing the interface as needed (e.g. expose events)

      The consumer is the part of the program that carries out the
      actions. Consumer code should be run in background, except in
      those cases where the action is so fast that doing it in
      foreground will not adversely impact the GUI's responsiveness.

      At start-up time, an application has absolutely nothing more
      important to do than to get the initial GUI (action producer) up
      and usable. Anything else can be done afterwards. Typically an
      application should:

          * Get the initial minimal GUI up. This must happen very
            quickly. How quickly this gets done has a major impact on
            the user's perception of performance.

          * Queue up any other initialization that needs to be done. The
            background/workproc/consumer will do these as fast as it
            can. This includes such things as:

                * data initialization, such as a (possibly slow)
                  database access

                * creation of any remaining user interface elements,
                  such as menus and popups. The hope is that they will
                  be ready to be mapped when needed.

          * Put up a wait cursor if some of the queued initialization
            needs to be completed before the user can interact with the
            window.

          * Start processing events. At least redraws will get handled,
            even if the interface needs to wait for some initialization
            to be fully functional.

          * Add to (or modify) the task queue for the consumer as
            indicated by user input.

    * High performance GUI design

      Design must take the toolkit into account. Motif does some things
      well, and some things less well. Things it does less well, and
      that are demonstrably important to SGI, are good candidates for
      new SGI widgets and/or Motif extensions. In the meantime, a good
      GUI design must take toolkit realities into account.
      Layout and precise pixels may have to give a little. An attempt to
      precisely achieve something that does not fit well with the
      toolkit is likely to cause two problems:

          * A more difficult than necessary development effort.
          * A worse than necessary performing product.

    * Compile with -O2

    * Use libmalloc

    * Do performance measurements! (use cvspeed)

      Odds are high that performance of Motif and other heavily used
      software is less of a bottleneck than your own new, untuned code.
      The application also may itself benefit from tuning. This
      suggestion is more than theoretical -- it comes from looking at
      some real-world performance problems.

    * Note that just because Motif (or Xt) shows up high on a list does
      not necessarily mean that Motif itself needs to be tuned. It often
      means that your application is needlessly doing expensive things.
      For example, if you repeatedly create and destroy hundreds of
      widgets, or if you force unnecessary widget geometry management,
      Motif or Xt functions will show up high on any performance chart.
      The real question is "does the application really need to be doing
      this?"

    * No one ever said that developing good-looking efficient modern
      GUIs is simple -- the goal is a good interface for the end user,
      but one cost may be a certain amount of developer complexity.

* GENERAL MOTIF-SPECIFIC APPLICATION ISSUES

    * Toolkit applications, including Motif applications, are inherently
      heavier than applications that use no toolkit, such as the zip
      editor. There is no getting around that. This makes it even more
      important to be careful how an application uses X and Motif.

    * Minimize the number of GL widgets in an application. Each GL
      widget has a separate GL context, and GL context switching is
      expensive. Having multiple GL widgets will visibly degrade your
      performance.

    * Some are concerned that deep widget hierarchies, particularly
      those with complex nested geometry management (form), are slow.
      Others are not so sure -- you may have to experiment with this one
      yourself.
    * When adding multiple items to a list, use XtSetValues on the
      resources XmNitems and XmNitemCount rather than calling
      XmListAddItem multiple times.

    * Use gadgets. This is an area where reasonable folks have different
      opinions.

          * Some folks say that you should use gadgets where possible
            such as for menu items.

          * Other folks say that as X has handled windows better, there
            is less of a win to using gadgets -- in fact in some cases
            performance may degrade.

          * Note that, because "map" deals with windows (and gadgets do
            not have windows), you cannot map or unmap a gadget. As a
            practical matter, however, this shouldn't be a significant
            limitation. You generally will be mapping and unmapping some
            sort of a shell (which may have gadgets as descendants).

          * Using gadgets causes extra server traffic because the widget
            parent must track all events for them -- even some they
            wouldn't need to track themselves if they were widgets.

      We have seen cases where lots of widgets were involved, and when
      gadgets were used instead there was a very noticeable speed-up.

      For now, probably the best advice is:

          * If you have lots of things that could be gadgets, it is
            probably worthwhile to make them gadgets.

          * It is generally easy to try both ways, so if in doubt do try
            it and see.

* APPLICATION START-UP ISSUES:

There are some things an application can do to minimize user-perceived
start-up time:

    * To the extent possible, manage children all at once, rather than
      one by one as they are created. Doing them one by one, when they
      could be done all at once, causes geometry calculation time to go
      up non-linearly. Widgets can be managed efficiently by creating
      unmanaged widgets, and then using XtManageChildren. Do this
      instead of using XtCreateManagedWidget, or doing XtManageChild
      after each widget you create.

    * libXm updating the dropsite database is quite expensive.

          * Set drag and drop to dynamic. Actually, this is encouraged
            on functional grounds as well.
            Furthermore, drag and drop standardization through the X
            Consortium is only supporting the dynamic model.

          * Minimize time spent by the library updating its dropsite
            database. You may save considerable startup time by calling
            XmDropSiteStartUpdate at the beginning of the widget
            creation code and XmDropSiteEndUpdate when done. Likewise,
            use these calls to bracket any code that causes substantial
            geometry management. In one demo, doing this gave a 70%
            improvement (from 10 seconds to 3 seconds for the
            operation). Make sure you don't forget the
            XmDropSiteEndUpdate, otherwise your drag and drop won't work
            correctly.

          * If your application still runs slowly, try turning off drag
            and drop, except where you really do need it. This should
            only be done as a last resort, because it interferes with
            consistency of the application. The user is left wondering
            where DND works and where it does not. However, some SGI
            applications have saved tens of seconds of startup time by
            completely turning off DND.

    * Get the interface up ASAP. Do things that can be delayed either in
      a workproc or in other background processing after the interface
      is up. (Workproc's are described below). Some suggestions for
      typical things that ought to be done that way:

          * Compute intensive calculations not essential to getting the
            initial interface up.

          * Time-intensive things, such as initializing access to a
            database.

          * Creating any UI not initially needed (such as dialogs and
            menus).

      Also avoid bringing up the UI and then locking out the user for a
      long time. Although this is slightly better than having nothing
      show up, it is not much better. With proper use of polling for
      events and/or workproc(s), an application can generally get the
      time-consuming things done and still be responsive to the user.

    * Take proper advantage of workproc's. If there is a lot of
      initialization to do, and that initialization is not critical to
      getting the first interface up, the work should probably be done
      in a workproc.
      Doing so means that the main GUI can come up promptly without
      waiting for the workproc work to complete.

      Use workproc's to create UI components that will be wanted later.
      If the component is already created when it is finally needed,
      simply mapping it on demand is very fast. For example, creating
      any UI not initially needed (such as dialogs and menus) can be
      done in a workproc -- create them, realized and managed but
      unmapped, ready for mapping as needed. In most cases, if the
      application is careful about the order in which its workproc's are
      done, the GUI item should be ready before the user actually gets
      around to asking for it.

      The main code that brings up the dialog or menu needs to be
      something like:

          if (already created)
              map it      (n.b. the code that created it will have
                           already managed it)
          else
              set flag it is wanted

      And the workproc code needs a section after the item is created
      (and managed, but not mapped) that goes something like:

          if (already needed)
              map it

* APPLICATION RUN-TIME ISSUES:

Once up, Motif should not cause any serious slowdown for a well-written
application. Some things to be careful of:

    * Use at most one GL widget in an application. Having more than one
      GL window in your application is very expensive because a GL
      context switch is quite heavy-weight.

    * Take proper advantage of workproc's.

          * Example: the interactive performance of sliders in the color
            chooser was terrible until the slider processing was moved
            to a workproc. Then the performance became just fine.

          * Example: if a little-used UI component is called, it may pay
            to at that time use a workproc to create and manage (but not
            map) related ones that the user is likely to need. Then they
            can be quickly mapped when requested.

          * Note: do not remain in a workproc more than a fraction of a
            second without ensuring that events get processed. This
            point is discussed in more detail below.

    * Reuse components, such as dialogs, wherever you can.
      Do not create them as needed and then immediately destroy them
      when done. Instead, save their ID and unmap them. They can be
      quickly re-mapped later when needed. Note that this suggestion is
      consistent with creating most UI components (other than the
      initial UI) in workproc's.

    * Wherever possible use the first of these that you can:

      (1) Use map/unmap to make UI components come and go. This has
          always been good programming practice. As of the current level
          of Motif, with its need to keep drag and drop information up
          to date, failing to do this has become much more expensive
          than it used to be. N.b. -- of course the component must be
          managed when it is first created.

      (2) Use manage/unmanage to make UI components come and go, only if
          you really cannot just map/unmap them. (Of course the
          component must be managed once when it is first created.) This
          is significantly inferior to map/unmap, but may be necessary
          if the geometry really is changing. If you are affecting much
          geometry calculation, bracket your operation with
          XmDropSiteStartUpdate and XmDropSiteEndUpdate.

      (3) Use create/destroy as a last resort. This is quite expensive.
          It should only be done once -- preferably at a time the user
          isn't waiting for it.

      If the user does something that needs a new UI component up, you
      should normally be able to just map one that was either used
      before, or was created anticipating this need. If the user is
      doing something unusual, you may need to create a new one -- but
      that should not be the common case.

      Any time you do create a component, save it if it might get used
      again later. It will be much cheaper to remap it, possibly setting
      some new resources first, than to destroy it now and create it
      again later.

* BACKGROUND PROCESSING

An application's interface should always remain as responsive to a user
as possible.
This requires that:

    * Startup and foreground processing need to be very quick

    * Anything lengthy should be done in background

    * Any background processing must not lock out the foreground for
      more than a small fraction of a second

    * Even if the background processing must continue for a long time
      uninterrupted, it should at least arrange to process expose
      events.

There are several possible ways to do background processing:

    * Use one or more workproc's to do processing whenever there are no
      pending X events. Just don't stay in the workproc more than a
      fraction of a second without processing events (see below).

    * Use XtAppAddTimeout() to call function(s) on a regular basis. You
      might want to do this instead of using a workproc if your
      background processing is more at regular intervals than
      continuous.

    * Process from the application's main-line code. In this case, the
      application must do its own regular event polling and dispatching.

Xt is *event* driven, not *interrupt* driven. No background processing,
whether it be a workproc, a timeout routine, or just ordinary code, will
be interrupted by X. If the background processing really must run a long
time, then it should voluntarily relinquish control from time to time to
ensure that events get serviced frequently. If it does not do so,
processing user input will be delayed.

If you are using either workproc(s) or timeout routine(s), and if no one
of them runs for longer than a small fraction of a second, then event
handling will be responsive without taking any special action.

Keeping an application responsive, even if it needs time-consuming
background processing, need not be too restrictive. It just means that
the background processing must be properly structured. One way to think
of this is that it is the reverse of the usual -- instead of relying on
Xt to call the application from time to time for a little processing,
the application calls Xt from time to time to let Xt do a little
processing.
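
As a rough illustration of background work that relinquishes control
regularly, the C fragment below keeps a long job's state in a structure
and does a bounded chunk of work per call. It is only a sketch and is
not taken from any SGI code: the SumJob type, sum_job_step(), CHUNK and
the SketchBool stand-ins are hypothetical names, chosen so the fragment
compiles without X headers. The return convention mirrors that of an Xt
work procedure (True when finished, False to be called again). The
assumption is that a function of this shape, with Xt's Boolean and
XtPointer substituted, could be registered with XtAppAddWorkProc();
check the manual pages before relying on that.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins so the sketch compiles without X headers. In a real
 * application these would be Xt's Boolean, True and False. */
typedef int SketchBool;
#define SKETCH_TRUE  1
#define SKETCH_FALSE 0

/* Hypothetical state for one long-running job: summing a large array.
 * Everything the job needs to resume lives here, not on the stack. */
typedef struct {
    const long *data;   /* input values */
    size_t count;       /* number of input values */
    size_t next;        /* index of the next value to process */
    long total;         /* running result */
} SumJob;

/* Assumed number of items handled per call; pick the real value by
 * measuring how long one chunk takes on the target machine. */
enum { CHUNK = 1000 };

/* Do at most CHUNK items of work, then return to the caller.
 * Returns SKETCH_TRUE when the job is finished (do not call again)
 * and SKETCH_FALSE when more work remains (call again later). */
SketchBool sum_job_step(void *client_data)
{
    SumJob *job = (SumJob *)client_data;
    size_t stop;

    assert(job != NULL);

    /* Work out where this chunk ends, without running past the data. */
    stop = job->next + CHUNK;
    if (stop > job->count)
        stop = job->count;

    while (job->next < stop) {
        job->total += job->data[job->next];
        job->next++;
    }

    return (job->next == job->count) ? SKETCH_TRUE : SKETCH_FALSE;
}
```

Whatever drives the job (the Xt event loop in a real application) keeps
calling sum_job_step() with the same SumJob until it reports completion.
Between calls, pending events get their chance to be serviced, which is
the point of keeping each chunk short.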
There are several ways to structure long background processing so that
events still get serviced:

    * You can use a workproc that at frequent intervals notes its own
      state and returns FALSE. That will let any pending events be
      serviced, and then the workproc will be called again. When the
      workproc is finally done, it needs to return TRUE so it will not
      get called again.

      Note that workproc's are called in order. If you want several
      workproc's to get time in any other manner, you must arrange that
      yourself.

    * You can use a timeout routine that at frequent intervals notes its
      own state, sets another timeout, and returns. That will let any
      pending events be serviced, and then when the timer goes off the
      routine will be called again.

      Note that timeouts are higher priority than workproc's. No
      workproc will get run unless there are no pending events and also
      no expired timeouts. Thus, if your timeout routine sets another
      timeout, and if it times out before the routine exits, events will
      get processed but no workproc will ever get called. Doing this is
      one way to get events serviced during a long timeout routine. It
      in effect turns your timeout routine into the highest priority
      workproc. If you want a real workproc called before such a timeout
      routine is done, the timeout routine must do that itself.

    * In some cases, it will be difficult (or impossible) to keep track
      of state, exit, and be restarted. An example of this might be a
      lengthy recursive algorithm. In that case, the background
      processing needs to do its own regular polling and event
      dispatching. When to do the polling can be determined in any of a
      number of ways, such as:

          * algorithmically (perhaps each time through the process's
            main loop)
          * after each significant section of the background task
          * poll from a timeout routine.

      If a workproc must be called, it is up to the background
      processing to do so itself.

* OVERVIEW OF WORKPROC'S:

    * Xt provides a limited form of background processing, the
      XtWorkProc. An XtWorkProc is called when there are no events
      pending.
      An application may have more than one workproc.

    * Register each workproc, using XtAppAddWorkProc(). The easiest way
      to preserve interactive response may be to register more than one
      workproc.

    * If a workproc is to be run at any time other than the standard (in
      order, and only when no events are pending), the application must
      call the workproc(s) directly.

    * There is no non-blocking way to tell that a workproc has been
      registered. If this knowledge matters to an application, the app
      must keep track of it itself.

    * Workproc's are unrelated to X events, and are of lower priority.
      The only connection is that a workproc is called from the X event
      loop when there are no X events to service. Once a workproc is
      called, nothing else will run until the workproc returns.

    * When a workproc returns:

          * any pending events will be serviced
          * if the workproc returned FALSE, it will be run again
          * if the workproc returned TRUE, it will be un-registered, and
            the next workproc (if any) will be run.

    * No workproc runs until all higher priority workproc's have
      completed. If your application needs anything else, it will have
      to do its own scheduling (instead of depending on Xt). In such a
      case, it will not register most or all of its workproc's. It will
      just arrange to call them itself.

    * For more than one workproc, there is a defined order in which they
      will be called. One will not be run until all higher priority
      workproc's have been run to completion (i.e. returned TRUE).

          * For workproc's registered when not already running a
            workproc, the last registered is the highest priority.

          * workproc's registered from within a workproc are lower
            priority than that workproc.

    * You might use more than one workproc if:

          * Putting clearly separate tasks in distinct workproc's is the
            cleanest way to write the application. For example, if your
            initialization needs to create several dialog windows, you
            could register several workproc's to do so.
            Note that this could be the same workproc each time, perhaps
            with different client data. The workproc would keep track of
            which action to perform each time either by the different
            client data argument, or by keeping its own internal static
            records.

          * One long workproc task can be broken up into a number of
            acceptably short tasks. By putting them in separate
            workproc's, you save the trouble of ensuring that the
            workproc periodically returns to the X event loop. If you
            keep all in one workproc, you are responsible for preserving
            interactivity -- by using polling or timeouts or whatever
            else to ensure that the one workproc doesn't hog the cpu for
            too long.

* POLLING FOR EVENTS

Long-running background tasks should poll for, and process, events. At a
minimum, the process should honor expose events. If expose events are
all that is being processed, the application should put up a busy
cursor.

Following is a code sample that will poll for and process all events.
Note that if you are in a callback and there is a pending event that
would trigger the callback, you will get re-entered. If you are in a
timeout routine and another timer for that routine has gone off, you
will also get re-entered. Your application is responsible for handling
such a case. Of course, you can avoid the problem by keeping your
callbacks and timeout routines short enough that polling is not
necessary.

    while (XtAppPending(appContext))
        XtAppProcessEvent(appContext, XtIMAll);

The manual pages describe these functions and their parameters in more
detail. You can do such things as process only specific events. This
document is only an overview.
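
One possible way to handle the re-entry case described above is a simple
"busy" flag, sketched below in plain C. This is an illustration under
stated assumptions, not code from any SGI application: the names
long_task_callback(), simulated_poll(), PollFn, STEPS and POLL_EVERY are
all hypothetical, and the polling loop is passed in as a function
pointer so the fragment compiles without X headers. In a real
application the poll step would be the XtAppPending() /
XtAppProcessEvent() loop shown above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the event polling loop. */
typedef void (*PollFn)(void);

static int busy = 0;          /* nonzero while the long task is running */
static int runs_started = 0;  /* how many times the task really started */
static int runs_refused = 0;  /* re-entrant requests that were ignored  */

/* Assumed sizes, for illustration only. */
enum { STEPS = 100, POLL_EVERY = 10 };

/* Body of a callback that does a long task and polls as it goes.
 * If polling dispatches an event that triggers this same callback,
 * the nested call sees the busy flag and returns immediately. */
void long_task_callback(PollFn poll)
{
    int step;

    if (busy) {
        runs_refused++;   /* re-entered from inside our own polling */
        return;
    }

    busy = 1;
    runs_started++;

    for (step = 0; step < STEPS; step++) {
        /* ... one slice of the real work would go here ... */

        if (poll != NULL && step % POLL_EVERY == 0)
            poll();       /* may call long_task_callback() again */
    }

    busy = 0;
}

/* Simulated worst case: every poll delivers an event that triggers
 * the same callback again. */
void simulated_poll(void)
{
    long_task_callback(simulated_poll);
}
```

Ignoring the nested request is only one policy. An application might
instead record that the action was requested again and rerun it once the
current run finishes; which is right depends on what the callback does.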
For details, read the appropriate sections from the following
references:

    * "X WINDOW SYSTEM TOOLKIT", by Asente & Swick

      PART I PROGRAMMER'S GUIDE
          section 6.8  "Getting Events"
                  6.9  "Dispatching Events"
                  6.10 "Custom Event-Dispatching Loops"
                  6.11 "Background Work Procedures"
                  6.11 "Using Xlib Event Routines"

      PART II SPECIFICATION
          section 7.4  "Querying Event Sources"
                  7.5  "Dispatching Events"
                  7.6  "The Application Input Loop"
                  7.8  "Adding Background Work Procedures"

    * "X Toolkit Intrinsics Programming Manual, OSF/Motif 1.2 Edition",
      (O'Reilly Vol 4)

          section 9.6  "Work Procedures"

    * "X Toolkit Intrinsics Reference Manual", (O'Reilly Vol 5)

          XtWorkProc
          XtAppAddWorkProc()
          XtRemoveWorkProc()
          XtAppPending()
          XtAppNextEvent()
          XtAppProcessEvent()

    * "Motif Programming Manual" (O'Reilly Vol 6)

          section 20.2   "Working Dialogs"
                  20.2.1 "Using Work Procedures"
                  20.2.2 "Using Timers"
                  20.2.3 "Processing Events Yourself"

    * "The X Window System, Programming and Applications with Xt,
      OSF/Motif Edition", by Doug Young

          section 5.7  "USING WORKPROCS"

* NON-TOOLKIT X PERFORMANCE NOTES

    * Don't needlessly complicate window clipping. For example:

          +----------+
          |          |
          |    A  +-----------+
          |       |           |
          +-------|     B     |
                  |           |
                  +-----------+

      Drawing performance to window A is reduced because it doesn't have
      a single rectangular clip. For toplevel windows, the way windows
      are overlapped is up to the user via the window manager, but
      inside your application hierarchy it should be avoided.

    * If you use a GC to draw to both a pixmap and a window, you pay the
      price of a full GC validation every time you switch between the
      pixmap and the window on SGI machines (the code to draw to the
      graphics hardware is totally different from the dumb frame buffer
      hardware used for pixmaps and so the hooks in the GC must be
      totally revalidated each time you switch between a window and a
      pixmap).
      This should not be a key performance issue but if you have code
      like:

          create window A
          create pixmap B
          create GC C
          repeat a lot of times
              do some primitive to A using C
              do some primitive to B using C

      The code will be faster if you use:

          create window A
          create pixmap B
          create GC C
          create GC D (the same as C)
          repeat a lot of times
              do some primitive to A using C
              do some primitive to B using D

    * The SGI X server does allow you to run with a default depth other
      than 8 (the default) using the -depth option (see Xsgi man page).
      Most X clients use the default visual. This means that all of
      their pixmaps will be of the default depth. A pixmap of depth 12
      takes twice the memory of a pixmap of depth 8. A pixmap of depth
      24 takes four times the memory of a pixmap of depth 8. Also the
      rendering code to these pixmaps is slower. If you need an X client
      to use a 24-bit TrueColor visual, the best thing is to write that
      program to find such a visual instead of reconfiguring the X
      server to use a 24-bit TrueColor visual.

    * Compress expose events. Naive redraws will not only be visually
      unattractive but also waste the X server's time. Motif and Xt do
      this for you.

BACKING STORE and SAVE UNDER

Backing store and save under are intended as performance optimizations.
In practice, using them may instead lead to performance degradation.

Backing store and save under for GL windows:

    * Backing store and save under will never work for IrisGL clients.
      That is, IrisGL pixels will not be saved and restored.

    * Backing store and save under will never work for OpenGL that is
      rendered directly (rather than through the X server).

    * Backing store and save under may possibly work in a future
      release, for OpenGL that is rendered through the X server.
      However, that is definitely not currently committed and may never
      happen.

    * The difference in these three cases is the ability to render to
      pixmaps, based on the window's clip.
          * IrisGL will never render to pixmaps

          * Direct-rendered OpenGL clients have neither knowledge of the
            window's clip, nor the tolerance for any additional level of
            indirection that would affect rendering performance.

          * Indirectly rendered OpenGL clients will share the server's
            knowledge of the clip and can tolerate the extra level of
            indirection needed to decide whether to render to pixmap or
            to the screen

Backing store and save under for non-GL windows:

    * Most Xt apps may be better off without using either backing store
      or save unders.

    * Consider a control panel with a few buttons and a few strings.
      Backing store is almost certainly a major lose in 99+% of
      configurations, because rendering a few rectangles and a little
      text should be much faster than saving and restoring the affected
      pixels.

    * On the other hand, if you have a very expensive, difficult
      rendering to do, and you're essentially blitting pixels anyway,
      then backing store can

          * prevent you from having to recalculate (if your app is dumb
            enough to have to recalculate)

          * prevent the pixels from having to go over the net.

    * The app developer must understand the backing store implementation
      and think the performance issues through.

One place that backing store contributes to poor perceived performance
is using window managers with opaque move enabled (mwm and 4Dwm support
opaque move). When you move an opaque window over a backing store
window, you are asking the server to do lots of backing store
operations. The result can be very sluggish opaque moves. If clients
handle their own redraws, the movement can be much more fluid because
clients catch up on exposes and can compress them. The visual effect is
not nearly as jerky.

TODO:

    * Test gadget creation speed. Perhaps this would make a good demo.

    * SGI has some constraints in what we can do with/to the Motif
      library:

          * Motif itself is constrained by Xt and X11.
          * Motif is a standard, so we can only do things that will not
            change any application-visible behavior.

          * We need to merge new releases of Motif several times per
            year. We cannot make changes that are on a scale that will
            prevent us from keeping up with the standard.